Some Image Processing: Motion Capture in Video, Noisy Image Restoration with Inverse Filter

The following problems appeared as exercises in the Coursera course Image Processing (by Northwestern University). The problem descriptions are taken directly from the assignments.

In [5]:
#ipython nbconvert pcaiso.ipynb
%matplotlib inline

from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')
Out[5]:

1. Analysis of Image Quality after applying an nxn Low Pass Filter (LPF) for different n

The next figure shows the problem statement. Although it was originally implemented in MATLAB, a Python implementation is described in this article.

In [6]:
from IPython.display import Image
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\ex1.png', width=800)
Out[6]:

The following figure shows how the image gets more and more blurred as the size n of the applied nxn LPF increases.

In [39]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im1.png')
Out[39]:

The following figure shows how the quality of the transformed image, measured in terms of PSNR against the original image, degrades as n (the LPF kernel width) increases.
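A minimal sketch of this experiment, assuming a box (averaging) LPF via `scipy.ndimage.uniform_filter` and the standard PSNR definition; the small random test image is only a stand-in for the one used in the assignment:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def psnr(original, restored, peak=255.0):
    # Peak signal-to-noise ratio (in dB) between two same-sized images
    mse = np.mean((original.astype(float) - restored.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)

# A toy grayscale image stands in for the assignment's test image
rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64)).astype(float)

# Apply an n x n box LPF for increasing n and watch the PSNR fall
for n in (3, 5, 7, 9):
    blurred = uniform_filter(img, size=n, mode='reflect')
    print(n, round(psnr(img, blurred), 2))
```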

In [10]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im2.png')
Out[10]:

2. Changing Resolution of an Image with Down/Up-Sampling

The following figure describes the problem:

In [38]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\ex2.png', width=800)
Out[38]:

The following steps need to be followed:

  1. Smooth the original image with a 3x3 LPF (box) kernel.
  2. Downsample the smoothed image (keep the pixels at every odd row and column).
  3. Upsample the image (double the width and height).
  4. Convolve the upsampled image with the kernel to interpolate and obtain the final image.

Although MATLAB was used in the original implementation, the following results come from a Python implementation.
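The four steps above can be sketched as follows, assuming zero-insertion upsampling followed by a bilinear interpolation kernel (the exercise's exact interpolation kernel may differ); the tiny random image is only a stand-in:

```python
import numpy as np
from scipy.ndimage import convolve

# Hypothetical small grayscale image standing in for the one in the exercise
rng = np.random.default_rng(1)
img = rng.integers(0, 256, (8, 8)).astype(float)

# Step 1: smooth with a 3x3 box (averaging) kernel
box = np.ones((3, 3)) / 9.0
smoothed = convolve(img, box, mode='reflect')

# Step 2: downsample -- keep every other row and column
down = smoothed[::2, ::2]

# Step 3: upsample -- insert zeros to double the width and height
up = np.zeros_like(img)
up[::2, ::2] = down

# Step 4: interpolate by convolving the zero-filled image with a kernel;
# a bilinear kernel (scaled so the interpolated values are averages of
# their neighbours) is a common choice
bilinear = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
final = convolve(up, bilinear, mode='reflect')
print(final.shape)  # same size as the original
```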

In [14]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\sampling3.png', width=800)
Out[14]:

As the kernel size n increases, the quality of the final image obtained by down/up-sampling the original image decreases, as shown in the following figure.

In [17]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im3.png')
Out[17]:

3. Motion Estimation in Videos using Block matching between consecutive video frames

The following figure describes the problem:

In [37]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\ex3.png', width=800)
Out[37]:

For example, we are provided with an input frame in which the location of an object (a face, marked with a rectangle) is known, as shown below.

In [22]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\digital-images-week4_quizzes-frame_1_with_block.jpg', width=800)
Out[22]:

Now we are provided with another image, the next frame extracted from the same video, but with the face unmarked, as shown below. The problem is to locate the face in this next frame and mark it using a simple block matching technique (and thereby estimate the motion).

In [23]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\digital-images-week4_quizzes-frame_2.jpg', width=800)
Out[23]:

As shown below, using just simple block matching, we can mark the face in the very next frame.

In [24]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\digital-images-week4_quizzes-frame_2_with_block.jpg', width=800)
Out[24]:
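The block matching used here can be sketched as an exhaustive search that minimises the sum of absolute differences (SAD) over a small window around the block's previous position; the `match_block` helper, search radius, and the tiny synthetic frames below are illustrative, not the course's exact code:

```python
import numpy as np

def match_block(frame, template, top_left, search=8):
    # Slide `template` over `frame` within +/- `search` pixels of
    # `top_left` and return the position with the smallest SAD
    h, w = template.shape
    r0, c0 = top_left
    best, best_pos = np.inf, top_left
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + h > frame.shape[0] or c + w > frame.shape[1]:
                continue
            sad = np.abs(frame[r:r + h, c:c + w] - template).sum()
            if sad < best:
                best, best_pos = sad, (r, c)
    return best_pos

# Tiny synthetic example: a bright 4x4 "face" block shifts by (2, 3) pixels
f1 = np.zeros((32, 32)); f1[10:14, 10:14] = 255
f2 = np.zeros((32, 32)); f2[12:16, 13:17] = 255
template = f1[10:14, 10:14]
print(match_block(f2, template, (10, 10)))  # -> (12, 13)
```

Repeating this frame-to-frame (using each new match as the next frame's starting position) is what tracks the face through the video.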

Now let's play with the following two videos. The first one (obtained from YouTube) shows some students walking along a university corridor, as shown below. We extract a set of consecutive frames, mark a face in one frame, and then use that marked block to locate the face in each of the remaining consecutive frames, thereby marking the entire video and estimating the motion using the simple block matching technique only.

In [30]:
# IPython's Image class can only embed PNG/JPEG, so passing a GIF raised a
# ValueError here in the original run; rendering the GIF via an HTML <img>
# tag (with a path the browser can reach) avoids the problem
HTML('<img src="C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\Week4\\motion_example\\in.gif">')

The following figure shows the frame with the face marked. We shall now use this image and the block matching technique to estimate the motion of the student in the video, by marking his face in all the consecutive frames and reconstructing the video, as shown below.

In [29]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\Week4\\motion_example\\out\\block_001.png', width=800)
Out[29]:
In [36]:
# As above, embed the animated GIF via an HTML <img> tag instead of Image,
# which cannot embed the GIF format
HTML('<img src="C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\Week4\\motion_example\\out\\out.gif">')

The second video (also obtained from YouTube) shows the Google CEO talking, as shown below. Again we extract a set of consecutive frames, mark his face in one frame, and use that marked block to locate the face in each of the remaining consecutive frames, thereby marking the entire video and estimating the motion using the simple block matching technique only.

In [32]:
# As above, embed the animated GIF via an HTML <img> tag instead of Image,
# which cannot embed the GIF format
HTML('<img src="C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\Week4\\pichai\\in.gif">')

The following figure shows the frame with the face marked. We shall now use this image and the block matching technique to estimate the motion of the Google CEO in the video, by marking his face in all the consecutive frames and reconstructing the video, as shown below.

In [34]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\Week4\\pichai\\out\\block_001.png', width=800)
Out[34]:
In [35]:
# As above, embed the animated GIF via an HTML <img> tag instead of Image,
# which cannot embed the GIF format
HTML('<img src="C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\Week4\\pichai\\out\\out.gif">')

The most interesting thing above is that we did not use any sophisticated image features such as HOG or SIFT; simple block matching still did a pretty decent job.

4. Using Median Filter to remove salt and pepper noise from an image

The following figure describes the problem:

In [40]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\ex4.png', width=800)
Out[40]:

The following figure shows the original image, the noisy image, and the images obtained after applying median filters of different sizes (nxn, for different values of n):

In [42]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im4.png')
Out[42]:

As can be seen from the following figure, the optimal median filter size is 5x5; it generates the highest quality output when compared to the original image.
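A sketch of the experiment, assuming `scipy.ndimage.median_filter` and a simple salt-and-pepper corruption routine; the gradient test image, noise level, and the `add_salt_pepper` helper are illustrative stand-ins for the assignment's setup:

```python
import numpy as np
from scipy.ndimage import median_filter

def add_salt_pepper(img, amount=0.05, rng=None):
    # Flip a fraction of the pixels to pure black or pure white
    rng = np.random.default_rng(0) if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0
    noisy[mask > 1 - amount / 2] = 255
    return noisy

def psnr(a, b, peak=255.0):
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)

# A smooth synthetic image (horizontal gradient), so the impulses stand out
img = np.tile(np.arange(64, dtype=float) * 4, (64, 1))
noisy = add_salt_pepper(img)

# Restore with nxn median filters of increasing size and compare PSNR
for n in (3, 5, 7):
    restored = median_filter(noisy, size=n)
    print(n, round(psnr(img, restored), 2))
```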

In [44]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im5.png')
Out[44]:

5. Using Inverse Filter to Restore Noisy Images with Motion Blur

The following figure shows the description of the problem:

In [46]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\ex5.png', width=800)
Out[46]:

The following figures show the theory behind inverse filters in the (frequency) spectral domain.

In [49]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im6.png')
Out[49]:

The following are the steps to restore images with motion blur using the Inverse Filter:

  1. Generate the restoration filter in the frequency domain from the frequency response of the motion blur, using the threshold T.
  2. Compute the spectrum of the blurred and noise-corrupted image (the input to restoration).
  3. Compute the spectrum of the restored image by multiplying the restoration filter with the spectrum of the blurred noisy image (convolution in the spatial domain corresponds to multiplication in the frequency domain).
  4. Generate the restored image from its spectrum via the inverse transform.
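The steps above can be sketched with NumPy's FFT as follows; the blur kernel, test image, and threshold are illustrative, and the thresholding scheme used here (unit gain where |H| <= T, so those frequencies pass through unamplified) is one common variant:

```python
import numpy as np

def inverse_filter(blurred, h, T):
    # Thresholded inverse filter: invert only the frequencies where the
    # blur response |H| exceeds T, to avoid amplifying noise elsewhere
    H = np.fft.fft2(h, s=blurred.shape)   # step 1: blur's frequency response
    G = np.fft.fft2(blurred)              # step 2: spectrum of degraded image
    Hinv = np.ones_like(H)
    mask = np.abs(H) > T
    Hinv[mask] = 1.0 / H[mask]
    # step 3: multiplication in the frequency domain; step 4: inverse FFT
    return np.real(np.fft.ifft2(G * Hinv))

# Hypothetical degradation: horizontal motion blur of length 5
h = np.zeros((8, 8)); h[0, :5] = 1 / 5.0
rng = np.random.default_rng(2)
img = rng.random((8, 8))
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(h, s=img.shape)))

restored = inverse_filter(blurred, h, T=0.1)
mse_blur = np.mean((blurred - img) ** 2)
mse_rest = np.mean((restored - img) ** 2)
print(mse_rest < mse_blur)  # -> True: restoration reduces the error
```

In the noise-free case a small T recovers more frequencies; with noise present, a larger T trades residual blur for less noise amplification, which is the trade-off explored below.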

The following figures show the result of applying the inverse filter with different threshold values T (from 0.1 to 0.9, in that order) to restore the noisy / blurred image.

In [52]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.1.png')
Out[52]:
In [53]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.2.png')
Out[53]:
In [54]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.3.png')
Out[54]:
In [55]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.4.png')
Out[55]:
In [56]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.5.png')
Out[56]:
In [57]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.6.png')
Out[57]:
In [58]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.7.png')
Out[58]:
In [59]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.8.png')
Out[59]:
In [60]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\inverse_filter_0.9.png')
Out[60]:

The following figure shows how the restored PSNR and the ISNR vary with different values of T.

In [61]:
Image(filename='C:\\courses\\coursera\\Past\\Image Processing & CV\\NorthWestern - Image Processing\\im8.png')
Out[61]: